AMENDMENT TO THE CLAIMS 

1. (Original) A method of training a paraphrase processing system, comprising:

receiving a cluster of related texts; 

selecting a set of text segments from the cluster; and

using textual alignment to identify paraphrase relationships between text in the text 
segments in the set. 

2. (Original) The method of claim 1 wherein using textual alignment comprises: 

using statistical textual alignment to align words in the text segments in the set; and 
identifying the paraphrase relationships based on the aligned words.

3. (Original) The method of claim 2 wherein using textual alignment comprises: 

using statistical textual alignment to align multi-word phrases in the text segments in the
set; and 

identifying the paraphrase relationships based on the aligned multi-word phrases.

4. (Original) The method of claim 1 wherein using textual alignment comprises: 

using heuristic word alignment to align words in the text segments in the set; and 
identifying the paraphrase relationships based on the aligned words. 

5. (Original) The method of claim 4 wherein using textual alignment comprises: 

using heuristic textual alignment to align multi-word phrases in the text segments in the 
set; and 

identifying the paraphrase relationships based on the aligned multi-word phrases.

6. (Original) The method of claim 1 and further comprising: 

calculating an alignment model based on the paraphrase relationships identified. 



7. (Original) The method of claim 6 and further comprising:

receiving an input text; and 

generating a paraphrase of the input text based on the alignment model.

8. (Original) The method of claim 1 and wherein selecting a set of text segments comprises:

selecting text segments for the set based on a number of shared words in the text 
segments.

9. (Currently Amended) The method of claim 1 and further comprising: 

prior to receiving a cluster, identifying the cluster of related texts.

10. (Original) The method of claim 9 wherein identifying a cluster comprises: 

accessing a plurality of documents; and 

identifying documents written by different authors about a common subject, as clusters of 
related documents. 

11. (Original) The method of claim 10 wherein selecting a text segment set comprises:

grouping desired text segments of the related documents in each cluster into a set of 
related text segments. 

12. (Original) The method of claim 11 wherein identifying documents comprises:

identifying documents written within a predetermined time of one another.

13. (Original) The method of claim 11 wherein accessing a plurality of documents comprises:

accessing a plurality of different news articles written about a common event. 



14. (Original) The method of claim 13 wherein accessing a plurality of different news articles
comprises:

accessing a plurality of different news articles written by different news agencies. 

15. (Original) The method of claim 14 wherein grouping desired text segments comprises: 

grouping a first predetermined number of sentences of each news article in each cluster 
into the set of related text segments. 

16. (Original) The method of claim 15 wherein selecting a set of text segments comprises: 

pairing each sentence in a given set of related text segments with each other sentence in 
the given set. 

17. (Original) A paraphrase processing system, comprising:

a textual alignment component configured to receive a set of text segments and identify 
paraphrase relationships between words in the set of text segments based on
alignment of the words.

18. (Original) The paraphrase processing system of claim 17 wherein the textual alignment
component is configured to generate an alignment model based on statistical or heuristic 
alignment of the words. 

19. (Original) The paraphrase processing system of claim 18 wherein the textual alignment 
component is configured to identify paraphrase relationships based on alignments of multi-word
phrases in the set of text segments. 

20. (Original) The paraphrase processing system of claim 17 and further comprising:

a clustering component configured to access a plurality of documents and cluster the 
documents based on a subject matter of the documents.

21. (Original) The paraphrase processing system of claim 20 wherein the clustering component
is configured to cluster documents written about a same subject. 

22. (Original) The paraphrase processing system of claim 20 wherein the clustering component 
is configured to extract predetermined text segments from clustered documents to form the set of
text segments. 

23. (Original) The paraphrase processing system of claim 22 and further comprising: 

a pairing component configured to identify a plurality of pairs of text segments based on
the set of text segments.

24. (Original) The paraphrase processing system of claim 23 wherein the pairing component is 
configured to identify the plurality of pairs of text segments by pairing each text segment in a 
given set of text segments with each other text segment in the given set of text segments. 

25. (Original) The paraphrase processing system of claim 20 and further comprising:

a data store storing the plurality of documents. 

26. (Original) The paraphrase processing system of claim 25 wherein the data store stores a 
plurality of different news articles written by different news agencies about a common event. 

27. (Original) The paraphrase processing system of claim 26 wherein the clustering component 
is configured to cluster the news articles based on a time at which the news articles were written. 

28. (Original) The paraphrase processing system of claim 27 wherein the data store is 
implemented in one or more data stores. 

29. (Original) The paraphrase processing system of claim 17 and further comprising:

a paraphrase generator, receiving a textual input and generating a paraphrase of the 
textual input based on the paraphrase relationships. 

A paraphrase processing system, comprising: 

a paraphrase generator receiving a textual input and generating a paraphrase of the textual 
input based on a paraphrase relationship received from a textual alignment 
component configured to receive a plurality of text segments and identify 
paraphrase relationships between words in the text segments based on alignment 
of the words. 



